Direct and Transposed Sparse Matrix-Vector Multiplication

Authors

  • Sorin Cotofana
  • Pyrrhos Stathis
  • Stamatis Vassiliadis
Abstract

In this paper we investigate the execution of Ab and A^T b, where A is a sparse matrix and b a dense vector, using the Blocked Based Compression Storage (BBCS) scheme and an Augmented Vector Architecture (AVA). In particular, we demonstrate that by using the BBCS format we can represent both the direct and the transposed matrix for the purposes of matrix-vector multiplication with no additional cost in storage, access time, or computation performance. To achieve this, we propose a new instruction and a hardware modification for the AVA. Subsequently, we evaluate the performance of the transposed Sparse Matrix-Vector Multiplication (SMVM) and demonstrate that, as for the direct SMVM, the BBCS scheme outperforms other general schemes such as the Jagged Diagonal (JD) and the Compressed Row Storage (CRS) by 1.7 to 4.1 times. Furthermore, we show that the BBCS scheme outperforms CRS and JD when the aforementioned SMVM is used in the Conjugate Gradient and Bi-Conjugate Gradient iterative solvers, for which speedups of 1.78 to 4.13 were achieved in simulations.
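To illustrate why the transposed product is the interesting case, the following is a minimal sketch of direct and transposed SpMV in the CRS baseline format the abstract compares against (the BBCS internals are not given here, and the 3x3 example matrix is purely illustrative): with the same three CRS arrays, A*b streams through rows with indirect *reads* of b, while A^T*b must instead perform indirect *writes* into y, which is what generally makes the transposed kernel harder to vectorize.

```python
def crs_spmv(val, col_idx, row_ptr, b):
    """Compute y = A*b for a CRS-stored matrix A (gather from b)."""
    n = len(row_ptr) - 1
    y = [0.0] * n
    for i in range(n):
        for k in range(row_ptr[i], row_ptr[i + 1]):
            y[i] += val[k] * b[col_idx[k]]
    return y

def crs_spmv_t(val, col_idx, row_ptr, b):
    """Compute y = A^T*b from the SAME CRS arrays: each nonzero
    A[i][j] scatters b[i] into y[j] (indirect writes into y)."""
    n = len(row_ptr) - 1
    y = [0.0] * n
    for i in range(n):
        for k in range(row_ptr[i], row_ptr[i + 1]):
            y[col_idx[k]] += val[k] * b[i]
    return y

# Example (square) matrix, stored once in CRS:
# A = [[4, 0, 1],
#      [0, 2, 0],
#      [3, 0, 5]]
val     = [4.0, 1.0, 2.0, 3.0, 5.0]
col_idx = [0,   2,   1,   0,   2]
row_ptr = [0, 2, 3, 5]
b = [1.0, 1.0, 1.0]

print(crs_spmv(val, col_idx, row_ptr, b))    # direct:     [5.0, 2.0, 8.0]
print(crs_spmv_t(val, col_idx, row_ptr, b))  # transposed: [7.0, 2.0, 6.0]
```

Both kernels use one copy of the matrix data; the paper's claim is that BBCS achieves this same single-representation property while also keeping the vector-architecture access pattern efficient in both directions.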

Similar resources

SIMD Parallel Sparse Matrix-Vector and Transposed-Matrix-Vector Multiplication in DD Precision

We accelerate a double precision sparse matrix and DD vector multiplication (DD-SpMV), and its transposition and DD vector multiplication (DD-TSpMV), by using SIMD AVX2 for Krylov subspace methods. We compare some storage formats of DD-SpMV and DD-TSpMV for AVX2 to eliminate performance degradation factors in CRS. Our experience indicates that BCRS4x1, with fitting block size to the SIMD register...


Efficient multithreaded untransposed, transposed or symmetric sparse matrix-vector multiplication with the Recursive Sparse Blocks format

In earlier work we have introduced the "Recursive Sparse Blocks" (RSB) sparse matrix storage scheme oriented towards cache-efficient matrix-vector multiplication (SpMV) and triangular solution (SpSV) on cache-based shared memory parallel computers. Both the transposed (SpMV^T) and symmetric (SymSpMV) matrix-vector multiply variants are supported. RSB stands for a meta-format: it recursively...


Sparse Matrix-Vector Multiplication on NVIDIA GPU

In this paper, we present our work on developing a new matrix format and a new sparse matrix-vector multiplication algorithm. The matrix format is HEC, which is a hybrid format. This matrix format is efficient for sparse matrix-vector multiplication and is friendly to preconditioners. Numerical experiments show that our sparse matrix-vector multiplication algorithm is efficient on...


Preconditioned Conjugate Gradient

[Front matter and table of contents: Chapter 1. Introduction; Chapter 2. Background; 2.1. Matrix Compu...]


Sparse Data Structures for Weighted Bipartite Matching

Inspired by the success of blocking to improve the performance of algorithms for sparse matrix vector multiplication [5] and sparse direct factorization [1], we explore the benefits of blocking in related sparse graph algorithms. A natural question is whether the benefits of local blocking extend to other sparse graph algorithms. Here we examine algorithms for finding a maximum-weight complete ...



Journal title:

Volume   Issue

Pages  -

Publication date: 2002